On the Number of Elements Needed in a POVM 
Attaining the Accessible Information 

Peter W. Shor

AT&T Labs - Research
Florham Park, NJ 07932, USA

shor@research.att.com



February 1, 2008 



Keywords: Quantum measurement; Accessible information; POVM's 

Abstract

We investigate a symmetric set of three quantum states in three dimensions
with interesting properties, which we call the lifted trine states. We show
that for the ensemble consisting of the three lifted trine states taken with
equal probabilities, the POVM measurement realizing the accessible
information must contain six projectors, giving a counter-example to a
conjecture of Levitin.

Accessible information was one of the first information-theoretic quantities
investigated with respect to quantum systems. The accessible information of
an ensemble of quantum states is the maximum mutual information obtainable
between the states of the ensemble and the outcomes of a POVM (positive
operator valued measurement) on these states. In this paper, we investigate how
complicated a measurement which achieves the accessible information must be.
Davies' theorem gives an upper bound on the number of elements of the POVM
needed to attain the accessible information; namely, if the ensemble being
considered is contained in a $d$-dimensional Hilbert space, then at most $d^2$
elements are needed in an optimal POVM. When all the states are real, this
bound can be improved to $d(d+1)/2$ [7]. C. Fuchs and A. Peres [3] have done
numerical studies on ensembles containing only two elements. They found no
examples where more than $d$ POVM elements were needed; that is, they found
that the optimal measurement could always be a von Neumann measurement. In two
dimensions, this was proved by Levitin [5], who also conjectured that in $d$
dimensions, if the number of quantum states in the ensemble is at most $d$, a
von Neumann measurement is sufficient to attain the accessible information. In
this paper, we give an ensemble of three real quantum states in three
dimensions, where a POVM attaining the accessible information must contain at
least six elements, the maximum allowed by the real-state version of Davies'
theorem.



We investigate the accessible information of an ensemble consisting of three
quantum states we call the lifted trine states, with equal probabilities on these
states. The lifted trine states are obtained by starting with the two-dimensional
quantum trine states $(1, 0)$, $(-1/2, \sqrt{3}/2)$, $(-1/2, -\sqrt{3}/2)$,
introduced by Holevo [4] and later studied by Peres and Wootters [6]. We add a
third dimension to the Hilbert space of the trine states, and lift all of the
trine states out of the plane into this dimension by an angle of
$\arcsin\sqrt{\alpha}$, so the states become $(\sqrt{1-\alpha}, 0, \sqrt{\alpha})$,
and so forth. We will be dealing with small $\alpha$ (roughly, $\alpha < .1$),
so that they are close to being planar. This is the most interesting regime.
When the trine states are lifted further out of the plane, they start behaving
in relatively uninteresting ways until they are close to being vertical; then
they start being interesting again, but this second regime is beyond the scope
of this paper. The lifted trine states are thus:



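$$
\begin{aligned}
T_0(\alpha) &= \left(\sqrt{1-\alpha},\; 0,\; \sqrt{\alpha}\right) \\
T_1(\alpha) &= \left(-\tfrac{1}{2}\sqrt{1-\alpha},\; \tfrac{\sqrt{3}}{2}\sqrt{1-\alpha},\; \sqrt{\alpha}\right) \\
T_2(\alpha) &= \left(-\tfrac{1}{2}\sqrt{1-\alpha},\; -\tfrac{\sqrt{3}}{2}\sqrt{1-\alpha},\; \sqrt{\alpha}\right)
\end{aligned}
\tag{1}
$$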

When it is clear what $\alpha$ is, we may drop it from the notation and use
$T_0$, $T_1$, and $T_2$.
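As a quick numerical illustration of Eq. (1), the following Python sketch (assuming NumPy is available; the helper name `lifted_trines` and the value $\alpha = 0.05$ are illustrative choices, not part of the original analysis) builds the three states and checks their normalization, their common overlap, and the lift angle $\arcsin\sqrt{\alpha}$.

```python
import numpy as np

def lifted_trines(alpha):
    """Rows are T_0, T_1, T_2 of Eq. (1) for a lift parameter 0 <= alpha <= 1."""
    angles = 2 * np.pi * np.arange(3) / 3            # 0, 120 and 240 degrees
    planar = np.sqrt(1 - alpha) * np.column_stack([np.cos(angles), np.sin(angles)])
    height = np.full((3, 1), np.sqrt(alpha))         # common third component
    return np.hstack([planar, height])

alpha = 0.05                                         # illustrative value only
T = lifted_trines(alpha)
gram = T @ T.T                                       # pairwise inner products
lift_angle = np.arcsin(T[0, 2])                      # angle out of the x-y plane

print(np.round(gram, 6))
print(lift_angle)
```

Directly from Eq. (1), every off-diagonal entry of the Gram matrix equals $-\tfrac{1}{2}(1-\alpha) + \alpha = (3\alpha - 1)/2$, so the three states are mutually orthogonal only at $\alpha = 1/3$.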

In this section, we find the accessible information for this ensemble of lifted
trine states. The accessible information is defined as the maximal mutual
information between the trine states (with probabilities $1/3$ each) and the
elements of a POVM measuring these states. Because the lifted trine states are
real vectors, it follows from the version of Davies' theorem for real states [7]
that there is an optimal POVM with at most six elements, all the components of
which are real. The lifted trine states are three-fold symmetric, so by
symmetrizing we can assume that the optimal POVM is three-fold symmetric
(possibly at the cost of introducing extra POVM elements). Also, the optimal
POVM can be taken to have one-dimensional elements $E_i$, so the elements can be
described as vectors $|v_i\rangle$ where $E_i = |v_i\rangle\langle v_i|$. This
means that there is an optimal POVM whose vectors come in triples of the form
$\sqrt{p}\,P_0(\phi,\theta)$, $\sqrt{p}\,P_1(\phi,\theta)$,
$\sqrt{p}\,P_2(\phi,\theta)$, where $p$ is a scalar probability and

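$$
\begin{aligned}
P_0(\phi,\theta) &= \left(\cos\phi\cos\theta,\; \cos\phi\sin\theta,\; \sin\phi\right) \\
P_1(\phi,\theta) &= \left(\cos\phi\cos(\theta + 2\pi/3),\; \cos\phi\sin(\theta + 2\pi/3),\; \sin\phi\right) \\
P_2(\phi,\theta) &= \left(\cos\phi\cos(\theta - 2\pi/3),\; \cos\phi\sin(\theta - 2\pi/3),\; \sin\phi\right).
\end{aligned}
\tag{2}
$$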

Suppose that the optimal POVM has several such triples, which we call
$\sqrt{p_1}\,P_b(\phi_1,\theta_1)$, $\sqrt{p_2}\,P_b(\phi_2,\theta_2)$, ...,
$\sqrt{p_m}\,P_b(\phi_m,\theta_m)$. It is easily seen that the conditions for
this set of vectors to be a POVM are that

$$
\sum_{i=1}^{m} p_i \sin^2(\phi_i) = 1/3 \quad\text{and}\quad \sum_{i=1}^{m} p_i = 1. \tag{3}
$$
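The conditions in Eq. (3) can be checked numerically. The sketch below (NumPy assumed; `triple` and `povm_sum` are hypothetical helper names, and the particular weights and angles are illustrative) sums the rank-one operators of a set of weighted triples from Eq. (2) and compares the result with the identity.

```python
import numpy as np

def triple(phi, theta):
    """Rows are P_0, P_1, P_2 of Eq. (2)."""
    ang = theta + 2 * np.pi * np.arange(3) / 3       # theta, theta + 2pi/3, theta - 2pi/3 (mod 2pi)
    return np.column_stack([np.cos(phi) * np.cos(ang),
                            np.cos(phi) * np.sin(ang),
                            np.full(3, np.sin(phi))])

def povm_sum(ps, phis, thetas):
    """Sum of p_i |P_b><P_b| over all triples i and b = 0, 1, 2."""
    total = np.zeros((3, 3))
    for p, phi, theta in zip(ps, phis, thetas):
        P = triple(phi, theta)
        total += p * P.T @ P
    return total

# Weights satisfying Eq. (3): sum p_i sin^2(phi_i) = 1/3 and sum p_i = 1.
good = povm_sum([2 / 3, 1 / 3], [0.0, np.pi / 2], [np.pi / 6, 0.0])
# Weights violating Eq. (3): sum p_i sin^2(phi_i) = 1/2.
bad = povm_sum([0.5, 0.5], [0.0, np.pi / 2], [np.pi / 6, 0.0])

print(np.round(good, 6))
print(np.round(bad, 6))
```

Each triple contributes $p_i\,\mathrm{diag}\!\left(\tfrac{3}{2}\cos^2\phi_i,\ \tfrac{3}{2}\cos^2\phi_i,\ 3\sin^2\phi_i\right)$ independently of $\theta_i$, which is why only the two scalar conditions of Eq. (3) appear.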

The formula for accessible information $I_A$ can be broken into pieces so that
each triple contributes a linear amount to $I_A$. That is, $I_A$ is the weighted
average (weighted according to $p_i$) of some contribution $I(\phi,\theta)$ from
each $(\phi,\theta)$. To show this, recall that $I_A$ is the mutual information
between the input and the output, and this can be expressed as the entropy of
the input less the entropy of the input given the output,
$H(X_{\mathrm{in}}) - H(X_{\mathrm{in}} | X_{\mathrm{out}})$. The term
$H(X_{\mathrm{in}} | X_{\mathrm{out}})$ naturally decomposes into terms
corresponding to the various POVM outcomes, and there are several ways of
assigning the entropy of the input $H(X_{\mathrm{in}})$ to the various POVM
elements in order to complete this decomposition. Following this analysis
eventually gives the same answer as is obtained below (and is in fact how I
arrived at it). I briefly sketch this analysis so as to give the intuition
behind it, and then go into detail in a second analysis, which is superior in
that it explains the form of the answer.

For each $\phi$, and each $\alpha$, there is a $\theta$ that optimizes
$I(\phi,\theta)$. This $\theta$ starts out at $\pi/6$ for $\phi = 0$, decreases
until it hits $0$ at some value of $\phi$ (which depends on $\alpha$), and stays
at $0$ until $\phi$ reaches its maximum value of $\pi/2$. For a fixed $\alpha$,
by finding (numerically) the optimal value of $\theta$ for each $\phi$ and using
it to obtain the contribution to $I_A$ attributable to that $\phi$, we get a
curve giving the optimal contribution to $I_A$ for each $\phi$. If this curve is
plotted, with the $x$-value being $\sin^2\phi$ and the $y$-value being the
contribution to $I_A$, an optimal POVM is obtained from the set of points on
this curve whose average $x$-value is $1/3$ (from Eq. (3)), and whose average
$y$-value is as large as possible given this constraint on the $x$-values. A
simple convexity argument shows that we only need at most two points from the
curve to obtain this optimum, and that we will need one or two points depending
on whether the relevant part of the curve is concave or convex. For small
$\alpha$, it turns out that the relevant piece of the curve is convex, and we
need two $\phi$'s to achieve the maximum. Each of these $\phi$'s corresponds to
a triple of POVM elements. One of the $(\phi,\theta)$ pairs is $(0, \pi/6)$, and
the other is $(\phi_\alpha, 0)$ for some $\phi_\alpha > \arcsin(1/\sqrt{3})$.
The formula for this $\phi_\alpha$ will be derived later.

The analysis in the remainder of this section shows that this six-outcome
optimal POVM can be described in a different way, which unifies the optimal
measurements for the different $\alpha$'s. For small $\alpha$ ($\alpha < \gamma_1$
for some constant $\gamma_1$), we first take the trine $T_b(\alpha)$ and make a
partial measurement which either projects it down to the $x,y$ plane or lifts
it further out of the plane so that it becomes the trine $T_b(\gamma_1)$. (Note
that $\gamma_1$ is independent of $\alpha$.) If the trine was projected into the
$x,y$ plane, we make a second measurement using the POVM with outcome vectors
$\sqrt{2/3}\,(0, 1)$ and $\sqrt{2/3}\,(\pm\sqrt{3}/2, -1/2)$. This is the
optimal POVM for trines in the $x,y$-plane. If the trine was lifted up, we use
the von Neumann measurement that projects onto the basis containing
$(\sqrt{2/3}, 0, \sqrt{1/3})$ and $(-\sqrt{1/6}, \pm\sqrt{1/2}, \sqrt{1/3})$. If
$\alpha$ is larger than $\gamma_1$ (but still smaller than $8/9$) we skip the
first partial measurement, and just use the above von Neumann measurement.
Here, $\gamma_1$ is obtained by numerically solving a fairly complicated
equation; we suspect that no closed form expression for $\gamma_1$ exists. The
value of $\gamma_1$ is $.061367$, which is $\sin^2\phi$ for $\phi = .25033$
radians ($14.343^\circ$).
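The two second-stage measurements named above are easy to sanity-check. The following sketch (NumPy assumed; all variable names are illustrative) verifies that the three planar outcome vectors resolve the identity on the $x,y$ plane, that the three lifted vectors form an orthonormal basis, and that $\gamma_1 = .061367$ corresponds to the quoted lift angle.

```python
import numpy as np

# Planar POVM: sqrt(2/3) (0, 1) and sqrt(2/3) (+-sqrt(3)/2, -1/2).
planar = np.sqrt(2 / 3) * np.array([[0.0, 1.0],
                                    [np.sqrt(3) / 2, -0.5],
                                    [-np.sqrt(3) / 2, -0.5]])
planar_sum = planar.T @ planar          # should be the 2x2 identity

# Von Neumann basis used when the trine has been lifted up.
basis = np.array([[np.sqrt(2 / 3), 0.0, np.sqrt(1 / 3)],
                  [-np.sqrt(1 / 6), np.sqrt(1 / 2), np.sqrt(1 / 3)],
                  [-np.sqrt(1 / 6), -np.sqrt(1 / 2), np.sqrt(1 / 3)]])
basis_gram = basis @ basis.T            # should be the 3x3 identity

gamma1 = 0.061367                       # value quoted in the text
angle_rad = np.arcsin(np.sqrt(gamma1))  # lift angle of T(gamma_1)
angle_deg = np.degrees(angle_rad)

print(np.round(planar_sum, 6))
print(np.round(basis_gram, 6))
print(angle_rad, angle_deg)
```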

We now give more details on this decomposition of the POVM into a two-step
process. We first apply a partial measurement which does not extract all of the
quantum information, i.e., it leaves a quantum residual state that is not
completely determined by the measurement outcome. Formally, we apply one of a
set of matrices $A_i$ satisfying $\sum_i A_i^\dagger A_i = I$. If we start with
a pure state $|v\rangle$, we observe the $i$'th outcome with probability
$\langle v | A_i^\dagger A_i | v\rangle$, and in this case the state
$|v\rangle$ is taken to the state $A_i |v\rangle$. For our purposes, we choose
as the $A_i$'s the matrices $\sqrt{p_i}\,M(\phi_i)$ where



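$$
M(\phi) =
\begin{pmatrix}
\sqrt{3/2}\,\cos\phi & 0 & 0 \\
0 & \sqrt{3/2}\,\cos\phi & 0 \\
0 & 0 & \sqrt{3}\,\sin\phi
\end{pmatrix}.
\tag{4}
$$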



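The $\sqrt{p_i}\,M(\phi_i)$ will form a valid partial measurement if and only if
$\sum_i p_i \sin^2(\phi_i) = 1/3$ and $\sum_i p_i = 1$, the same conditions
[Eq. (3)] as for the $P_b(\phi_i,\theta_i)$. By first applying the above
$\sqrt{p_i}\,M(\phi_i)$, and then applying the von Neumann measurement with the
three basis vectors

$$
\begin{aligned}
V_0(\theta) &= \left(\sqrt{2/3}\,\cos\theta,\; \sqrt{2/3}\,\sin\theta,\; 1/\sqrt{3}\right) \\
V_1(\theta) &= \left(\sqrt{2/3}\,\cos(\theta + 2\pi/3),\; \sqrt{2/3}\,\sin(\theta + 2\pi/3),\; 1/\sqrt{3}\right) \\
V_2(\theta) &= \left(\sqrt{2/3}\,\cos(\theta - 2\pi/3),\; \sqrt{2/3}\,\sin(\theta - 2\pi/3),\; 1/\sqrt{3}\right)
\end{aligned}
\tag{5}
$$

we obtain the POVM given by the vectors $\sqrt{p_i}\,P_b(\phi_i,\theta_i)$;
checking this is simply a matter of verifying that
$V_b(\theta)M(\phi) = P_b(\phi,\theta)$. Now, after applying
$\sqrt{p_i}\,M(\phi_i)$ to the trine $T_0(\alpha)$, we get the vector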



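$$
\left(\sqrt{3/2}\,\sqrt{1-\alpha}\,\sqrt{p_i}\cos\phi_i,\; 0,\; \sqrt{3}\,\sqrt{\alpha}\,\sqrt{p_i}\sin\phi_i\right). \tag{6}
$$

This is just the state $\sqrt{p_i'}\,T_0(\alpha_i')$, where $T_0(\alpha_i')$ is
the trine state with

$$
\alpha_i' = \frac{\alpha\sin^2\phi_i}{\alpha\sin^2\phi_i + \tfrac{1}{2}(1-\alpha)\cos^2\phi_i} \tag{7}
$$

and

$$
p_i' = 3p_i\left[\alpha\sin^2\phi_i + \tfrac{1}{2}(1-\alpha)\cos^2\phi_i\right] \tag{8}
$$

is the probability that we observe this trine state, given that we started with
$T_0(\alpha)$. Similar formulae hold for the trine states $T_1$ and $T_2$. We
compute that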



$$
\sum_i p_i'\,\alpha_i' = \sum_i 3p_i\,\alpha\sin^2(\phi_i) = \alpha. \tag{9}
$$
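The identities behind the two-step description, Eqs. (4) through (9), can be confirmed numerically. The sketch below assumes NumPy; the function names `M`, `V`, `P`, `T0` and `residual`, and the example values of $\alpha$, $p_i$ and $\phi_i$, are illustrative and chosen only so that Eq. (3) holds.

```python
import numpy as np

def M(phi):
    """Diagonal partial-measurement operator of Eq. (4)."""
    return np.diag([np.sqrt(1.5) * np.cos(phi),
                    np.sqrt(1.5) * np.cos(phi),
                    np.sqrt(3.0) * np.sin(phi)])

def V(theta):
    """Rows are the basis vectors V_0, V_1, V_2 of Eq. (5)."""
    ang = theta + 2 * np.pi * np.arange(3) / 3
    return np.column_stack([np.sqrt(2 / 3) * np.cos(ang),
                            np.sqrt(2 / 3) * np.sin(ang),
                            np.full(3, 1 / np.sqrt(3))])

def P(phi, theta):
    """Rows are P_0, P_1, P_2 of Eq. (2)."""
    ang = theta + 2 * np.pi * np.arange(3) / 3
    return np.column_stack([np.cos(phi) * np.cos(ang),
                            np.cos(phi) * np.sin(ang),
                            np.full(3, np.sin(phi))])

def T0(alpha):
    """Lifted trine T_0 of Eq. (1)."""
    return np.array([np.sqrt(1 - alpha), 0.0, np.sqrt(alpha)])

def residual(alpha, p, phi):
    """Return (alpha', p') from Eqs. (7) and (8)."""
    weight = alpha * np.sin(phi) ** 2 + 0.5 * (1 - alpha) * np.cos(phi) ** 2
    return alpha * np.sin(phi) ** 2 / weight, 3 * p * weight

alpha = 0.03                              # illustrative lift parameter
phis = np.array([0.0, np.pi / 4])         # sin^2(phi) = 0 and 1/2
ps = np.array([1 / 3, 2 / 3])             # chosen so that Eq. (3) holds

completeness = sum(p * M(f).T @ M(f) for p, f in zip(ps, phis))
pairs = [residual(alpha, p, f) for p, f in zip(ps, phis)]
alpha_primes = np.array([a for a, _ in pairs])
p_primes = np.array([q for _, q in pairs])

print(np.round(completeness, 6))          # identity: valid partial measurement
print(alpha_primes, p_primes)
print(np.sum(p_primes * alpha_primes))    # equals alpha, Eq. (9)
```

Because $M(\phi)$ is diagonal, multiplying the rows $V_b(\theta)$ by it from the right rescales the first two coordinates by $\sqrt{3/2}\cos\phi$ and the third by $\sqrt{3}\sin\phi$, which reproduces $P_b(\phi,\theta)$.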



Also notice that the first stage of this process, the partial measurement which
applies the matrices $\sqrt{p_i}\,M(\phi_i)$, reveals no information about which
of $T_0$, $T_1$, $T_2$ we started with. Thus, by the chain rule for classical
Shannon information [1], the accessible information obtained by our two-stage
measurement is just the weighted average (the weights being $p_i'$) of the
maximum over $\theta$ of the Shannon mutual information $I_{\alpha_i'}(\theta)$
between the outcome of the von Neumann measurement $V(\theta)$ and the trines
$T(\alpha_i')$. By convexity, it suffices to use only two values of $\alpha_i'$
to obtain this maximum. In fact, the optimum is obtained using either one or two
values of $\alpha_i'$ depending on whether the function

$$
I_{\alpha'} = \max_{\theta}\, I_{\alpha'}(\theta)
$$

is concave or convex over the appropriate region. In the remainder of this
section, we give the results of computing (numerically) the values of this
function $I_{\alpha'}$, and we show that for small enough $\alpha$ it is convex,
so that we need two values of $\alpha'$. We will then show that obtaining this
maximum requires a POVM with six outcomes.

We need to calculate the Shannon capacity of the channel whose input is
one of the three trine states $T(\alpha')$, and whose output is determined by
the von Neumann measurement $V(\theta)$. Because of the symmetry, we can
calculate this using only the first projector $V_0$. The Shannon mutual
information between the input and the output is
$H(X_{\mathrm{in}}) - H(X_{\mathrm{in}} | X_{\mathrm{out}})$, which is

$$
I_{\alpha'}(\theta) = \log_2 3 + \sum_{b=0}^{2} \langle V_0(\theta) | T_b(\alpha')\rangle^2 \,\log_2\!\left(\langle V_0(\theta) | T_b(\alpha')\rangle^2\right). \tag{10}
$$

We compute that the $\theta$ giving the maximum $I_{\alpha'}$ is $\pi/6$ when
$\alpha' = 0$, decreases continuously to $0$ at $\alpha' = .056651$ and remains
$0$ for larger $\alpha'$. (See Fig. 1.) This value $.056651$ corresponds to an
angle of $.24032$ radians ($13.769^\circ$). This $\theta$ was determined by
using the computer package Maple to numerically find the point at which
$dI_{\alpha'}(\theta)/d\theta = 0$.
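For readers who want to reproduce this behaviour without Maple, here is a minimal Python sketch (NumPy assumed) that evaluates Eq. (10) and locates the maximizing $\theta$ by a plain grid search over $[0, \pi/6]$. The function names, the grid resolution, and the sample values of $\alpha'$ are illustrative assumptions; this is a coarse stand-in for, not a reproduction of, the root-finding described above.

```python
import numpy as np

def mutual_info(alpha, theta):
    """I_alpha(theta) of Eq. (10), in bits."""
    b = np.arange(3)
    # <V_0(theta)|T_b(alpha)> = sqrt(2/3) sqrt(1-alpha) cos(theta - 2 pi b/3) + sqrt(alpha/3)
    amp = np.sqrt(2 / 3 * (1 - alpha)) * np.cos(theta - 2 * np.pi * b / 3) + np.sqrt(alpha / 3)
    q = amp ** 2
    q = q[q > 0]                        # convention 0 log 0 = 0
    return np.log2(3) + np.sum(q * np.log2(q))

def best_theta(alpha, n=3001):
    """Grid search for the theta in [0, pi/6] maximizing I_alpha(theta)."""
    thetas = np.linspace(0.0, np.pi / 6, n)
    values = np.array([mutual_info(alpha, t) for t in thetas])
    k = int(np.argmax(values))
    return thetas[k], values[k]

for a in (0.0, 0.03, 0.07):             # illustrative sample points
    theta_opt, info = best_theta(a)
    print(a, np.degrees(theta_opt), info)
```

At $\alpha' = 0$ the two endpoint values can be computed by hand from Eq. (10): $I_0(\pi/6) = \log_2 3 - 1$ and $I_0(0) = 1/3$.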

By plugging this optimum $\theta$ into the formula for $I_{\alpha'}$, we obtain
the optimum von Neumann measurement of the form $V$ above. We believe that this
is also the optimal generic von Neumann measurement, but we have not proved
this. The maximum of $I_{\alpha'}(\theta)$ over $\theta$, and curves that show
the behavior of $I_{\alpha'}(\theta)$ for constant $\theta$, are plotted in
Fig. 2. We can now observe that the first part of the curve is convex, and thus
that for small $\alpha$ the best POVM will have six projectors, corresponding to
two values of $\alpha'$. We calculate that for trine states with
$\alpha < .061367$, the two values of $\alpha'$ giving the maximum accessible
information are $0$ and $.061367$; we will let $\gamma_1 = .061367$ be this
second value. The trine states $T(\gamma_1)$ make an angle of $.25033$ radians
($14.343^\circ$) with the $x$-$y$ plane. The accessible information thus
obtained is plotted in Fig. 3.
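One way to see where the constant $.061367$ comes from is to treat the upper convex envelope as the line through the point $(0, \log_2 3 - 1)$ that is tangent to the curve $I_{\alpha'}(0)$, and to find the tangency point as the $\alpha'$ that maximizes the chord slope. The sketch below (NumPy assumed) does this with a simple grid search; the tangent-line formulation, the search window, and the function name `info_theta0` are assumptions made for illustration, not the procedure used to obtain the value quoted above.

```python
import numpy as np

def info_theta0(alphas):
    """I_alpha(0) from Eq. (10), vectorized over an array of alpha values."""
    alphas = np.asarray(alphas, dtype=float)[:, None]
    cosines = np.cos(2 * np.pi * np.arange(3) / 3)[None, :]   # 1, -1/2, -1/2
    amp = np.sqrt(2 / 3 * (1 - alphas)) * cosines + np.sqrt(alphas / 3)
    q = amp ** 2                                              # positive on the window below
    return np.log2(3) + np.sum(q * np.log2(q), axis=1)

info_at_zero = np.log2(3) - 1              # max over theta at alpha' = 0 (theta = pi/6)
alphas = np.linspace(0.03, 0.10, 70001)    # search window, step 1e-6
slopes = (info_theta0(alphas) - info_at_zero) / alphas
gamma1_estimate = alphas[np.argmax(slopes)]

print(gamma1_estimate)
```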

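We can now invert the formula for $\alpha'$ (Eq. (7)) to obtain a formula for
$\sin^2(\phi)$, and substitute the value of $\alpha' = \gamma_1$ back into the
formula to obtain the optimal POVM. We find

$$
\sin^2(\phi_\alpha) = \frac{1-\alpha}{1 + \frac{2 - 3\gamma_1}{\gamma_1}\,\alpha} \approx \frac{1-\alpha}{1 + 29.591\,\alpha} \tag{11}
$$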

where $\gamma_1 = .061367$ as above. Thus, the elements in the optimal POVM we
have found for the trines $T(\alpha)$, when $\alpha < \gamma_1$, are the six
vectors $P_b(\phi_\alpha, 0)$ and $P_b(0, \pi/6)$, where $\phi_\alpha$ is given
by Eq. (11) and $b = 0, 1, 2$. We must also prove there are no other POVM's
which attain the same accessible information. The argument above shows that any
optimal POVM must contain only projectors chosen from these six vectors: only
those two values of $\alpha'$ can give the maximum capacity, and for each of
these values of $\alpha'$ there are only three projectors in $V(\theta)$ which
can maximize $I_{\alpha'}$ for these $\alpha'$. It is easy to check that there
is only one set of probabilities $p_i$ which make the above six vectors into a
POVM, and that none of these probabilities are $0$ for $0 < \alpha < \gamma_1$.
Thus, for the lifted trine states with $0 < \alpha < .061367$, there is only one
POVM maximizing accessible information, and it contains six elements, the
maximum possible for real states by a generalization of Davies' theorem [7].
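The six-element measurement can be assembled and tested directly. In the sketch below (NumPy assumed; all function names and the choice $\alpha = 0.03$ are illustrative), the weights follow from Eq. (3): $p_2 = 1/(3\sin^2\phi_\alpha)$ for the triple $P_b(\phi_\alpha, 0)$ and $p_1 = 1 - p_2$ for the triple $P_b(0, \pi/6)$. The code checks completeness, computes the mutual information of the resulting POVM, and compares it with the linear interpolation between $\alpha' = 0$ and $\alpha' = \gamma_1$ and with a grid search over single measurements of the form $V(\theta)$.

```python
import numpy as np

GAMMA1 = 0.061367                          # numerical constant quoted in the text

def trines(alpha):
    """Rows are T_0, T_1, T_2 of Eq. (1)."""
    ang = 2 * np.pi * np.arange(3) / 3
    return np.column_stack([np.sqrt(1 - alpha) * np.cos(ang),
                            np.sqrt(1 - alpha) * np.sin(ang),
                            np.full(3, np.sqrt(alpha))])

def triple(phi, theta):
    """Rows are P_0, P_1, P_2 of Eq. (2)."""
    ang = theta + 2 * np.pi * np.arange(3) / 3
    return np.column_stack([np.cos(phi) * np.cos(ang),
                            np.cos(phi) * np.sin(ang),
                            np.full(3, np.sin(phi))])

def six_element_povm(alpha):
    """Weighted vectors sqrt(p_i) P_b for the two triples, plus the weights."""
    s = (1 - alpha) / (1 + (2 - 3 * GAMMA1) / GAMMA1 * alpha)   # sin^2(phi_alpha), Eq. (11)
    p2 = 1 / (3 * s)                                            # from Eq. (3)
    p1 = 1 - p2
    vectors = np.vstack([np.sqrt(p1) * triple(0.0, np.pi / 6),
                         np.sqrt(p2) * triple(np.arcsin(np.sqrt(s)), 0.0)])
    return vectors, p1, p2

def mutual_information(vectors, states):
    """Mutual information (bits) between equiprobable states and POVM outcomes."""
    joint = (states @ vectors.T) ** 2 / len(states)
    prod = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    mask = joint > 1e-15                                        # skip 0 log 0 terms
    return float(np.sum(joint[mask] * np.log2(joint[mask] / prod[mask])))

alpha = 0.03                               # illustrative value below GAMMA1
vectors, p1, p2 = six_element_povm(alpha)
info_six = mutual_information(vectors, trines(alpha))

phi_basis = np.arcsin(np.sqrt(1 / 3))      # triple(phi_basis, theta) is the basis V(theta)
info_end = mutual_information(triple(phi_basis, 0.0), trines(GAMMA1))
info_chord = (1 - alpha / GAMMA1) * (np.log2(3) - 1) + alpha / GAMMA1 * info_end
best_single = max(mutual_information(triple(phi_basis, t), trines(alpha))
                  for t in np.linspace(0.0, np.pi / 6, 301))

print(p1, p2)
print(info_six, info_chord, best_single)
```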

Figure 1: The value of $\theta$ maximizing $I_\alpha$ for $\alpha$ between $0$
and $.07$. This function starts at $\pi/6$ at $\alpha = 0$, decreases until it
hits $0$ at $\alpha = .056651$ and stays at $0$ for larger $\alpha$.






Figure 2: This plot shows $I_\alpha(\theta)$ for various $\theta$. The dashed
curves are $I_\alpha(0)$ and $I_\alpha(\pi/6)$. Note that $\theta = 0$ is
optimal for $\alpha > .056651$ and $\theta = \pi/6$ is optimal for
$\alpha = 0$. The dotted curves show $I_\alpha(\theta)$ for $\theta$ at
intervals of $3^\circ$ between $0$ and $\pi/6 = 30^\circ$. The solid curve shows
$I_\alpha(\theta_{\mathrm{opt}})$ for those $\alpha$ where neither $0$ nor
$\pi/6$ is the optimal $\theta$. The solid curve is slightly convex; this is
clearer in Fig. 3.

Figure 3: This graph contains three curves. The dashed curve is $I_\alpha(0)$
and the dotted curve is the maximum over $\theta$ of $I_\alpha(\theta)$ for
$\alpha < .056651$. The solid curve is the convex envelope of the other two
curves. This solid curve is a linear interpolation between $\alpha = 0$ and
$\alpha = .061367$ and corresponds to a POVM having six elements. It gives the
accessible information for the lifted trine states $T(\alpha)$ when
$0 < \alpha < .061367$.



